# Low VRAM Requirement
- **Deepseek R1 0528 FP4** (nvidia) · MIT · Large Language Model · Safetensors · 372 downloads · 17 likes
  A quantized version of the DeepSeek R1 0528 model from DeepSeek AI, an autoregressive language model based on an optimized Transformer architecture, available for commercial and non-commercial use.
- **Deepseek R1 0528 Quantized.w4a16** (RedHatAI) · MIT · Large Language Model · Safetensors · 126 downloads · 3 likes
  A quantized version of DeepSeek-R1-0528 that significantly reduces GPU memory and disk-space requirements by quantizing the weights to the INT4 data type.
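To make the INT4 idea concrete, here is a minimal round-to-nearest sketch of symmetric weight quantization. This illustrates the general principle only; it is not the actual pipeline used to produce this model, and the function names are hypothetical.

```python
# Illustrative symmetric round-to-nearest INT4 quantization (not the
# actual toolchain behind this model). INT4 holds integers in [-8, 7].

def quantize_int4(weights):
    """Map floats to integers in [-8, 7] with one per-tensor scale."""
    scale = max(abs(w) for w in weights) / 7.0  # 7 = largest positive INT4 value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from the INT4 codes."""
    return [v * scale for v in q]

weights = [0.42, -1.73, 0.05, 0.91, -0.33]
q, scale = quantize_int4(weights)
restored = dequantize_int4(q, scale)
# Each weight now needs 4 bits instead of 16 (FP16), roughly a 4x size
# reduction, at the cost of a small rounding error per weight.
```

The per-weight error is bounded by half the scale, which is why models with well-behaved weight distributions tolerate INT4 storage well.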
- **Wan2.1 VACE 1.3B** (Wan-AI) · Apache-2.0 · Text-to-Video · Supports Multiple Languages · 1,520 downloads · 44 likes
  Wan2.1 is an open and advanced foundational model for video generation, supporting a variety of video generation and editing tasks.
- **Stable Diffusion 3.5 Large DF11** (DFloat11) · Image Generation · 855 downloads · 2 likes
  A losslessly compressed version of stabilityai/stable-diffusion-3.5-large using the DFloat11 format, reducing size by 30% while maintaining 100% accuracy.
- **Qwen3 14B FP8 Dynamic** (RedHatAI) · Apache-2.0 · Large Language Model · Transformers · 167 downloads · 1 like
  Qwen3-14B-FP8-dynamic is an optimized large language model. By quantizing activations and weights to the FP8 data type, it reduces GPU memory requirements and improves computational throughput.
- **Wan2.1 T2V 14B** (wan-community) · Apache-2.0 · Text-to-Video · Supports Multiple Languages · 17 downloads · 0 likes
  Wan2.1 is an open and advanced large-scale video generation model with top-tier performance, capable of running on consumer-grade GPUs and excelling at multitask processing.
- **Deepcoder 14B Preview Exl2** (cgus) · Large Language Model · English · 46 downloads · 2 likes
  DeepCoder-14B-Preview is a code generation model built on DeepSeek-R1-Distill-Qwen-14B, focused on solving verifiable programming problems.
- **Lumina Gguf** (calcuis) · Image Generation · 627 downloads · 11 likes
  A GGUF-quantized version of Lumina, a model designed for high-quality image generation that produces images closely matching their text prompts.
- **Deepseek R1 Distill Qwen 32B Quantized.w8a8** (RedHatAI) · MIT · Large Language Model · Transformers · 3,572 downloads · 11 likes
  A quantized version of DeepSeek-R1-Distill-Qwen-32B that reduces memory requirements and improves computational efficiency through INT8 weight and activation quantization.
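A w8a8 scheme quantizes both the weights and the activations to INT8, so the inner products of a matrix multiply can run in integer arithmetic and only be rescaled to float at the end. A minimal per-tensor sketch of that idea (illustrative only, with hypothetical helper names; real kernels fuse this and use INT32 accumulators in hardware):

```python
# Illustrative symmetric per-tensor INT8 quantization for a w8a8-style
# dot product (not the model's actual kernels). INT8 range: [-127, 127].

def quant_int8(xs):
    """Floats -> (INT8 codes, scale) with symmetric per-tensor scaling."""
    scale = max(abs(x) for x in xs) / 127.0
    q = [max(-127, min(127, round(x / scale))) for x in xs]
    return q, scale

def int8_dot(x, w):
    """Dot product computed in integer arithmetic, rescaled to float."""
    qx, sx = quant_int8(x)  # "dynamic": activation scale found at runtime
    qw, sw = quant_int8(w)  # weight scale is fixed ahead of time in practice
    acc = sum(a * b for a, b in zip(qx, qw))  # INT32 accumulator in real kernels
    return acc * sx * sw

x = [0.5, -1.0, 2.0]
w = [1.5, 0.25, -0.75]
approx = int8_dot(x, w)                    # close to the exact result
exact = sum(a * b for a, b in zip(x, w))   # -1.0
```

Because INT8 has 255 usable levels, the relative error per dot product is small; the memory and bandwidth savings versus FP16 are roughly 2x for both weights and activations.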
- **Deepseek R1 Distill Llama 70B FP8 Dynamic** (RedHatAI) · MIT · Large Language Model · Transformers · 45.77k downloads · 9 likes
  An FP8-quantized version of DeepSeek-R1-Distill-Llama-70B that optimizes inference performance by reducing the bit width of weights and activations.
- **Pixart** (calcuis) · Image Generation · English · 459 downloads · 2 likes
  A quantized version based on PixArt-alpha/PixArt-XL-2-1024-MS, supporting efficient text-to-image tasks.
- **Sd3.5 Medium Gguf** (calcuis) · Other · Image Generation · English · 3,232 downloads · 13 likes
  A GGUF-quantized version of Stable Diffusion 3.5 Medium, suited to text-to-image tasks and capable of running on older devices.
- **Sd3.5 Large Turbo** (calcuis) · Other · Text-to-Image · English · 108 downloads · 5 likes
  A GGUF-quantized version of Stable Diffusion 3.5 Large Turbo, suited to image generation tasks and offering more efficient runtime performance.
- **Hands XL** (xyy1551308532) · Image Generation · 27 downloads · 2 likes
  A text-to-image generation model combining Hands XL, SD 1.5, and FLUX.1-dev technologies, focused on high-quality image generation.
- **Llama 3.1 8B Instruct FP8** (nvidia) · Large Language Model · Transformers · 3,700 downloads · 21 likes
  An FP8-quantized version of the Meta Llama 3.1 8B Instruct model, an autoregressive language model with an optimized Transformer architecture and support for a 128K context length.
- **FLUX.1 Dev Qint4** (Disty0) · Other · Text-to-Image · English · 455 downloads · 12 likes
  FLUX.1-dev is a text-to-image generation model quantized to the INT4 format using Optimum Quanto, suitable for non-commercial use.
- **Meta Llama 3.1 8B Instruct Quantized.w4a16** (RedHatAI) · Large Language Model · Transformers · Supports Multiple Languages · 27.51k downloads · 28 likes
  A quantized version of Meta-Llama-3.1-8B-Instruct, optimized to reduce disk space and GPU memory requirements, suitable for chat-assistant scenarios in English-language business and research.
- **Meta Llama 3.1 8B Instruct GPTQ INT4** (hugging-quants) · Large Language Model · Transformers · Supports Multiple Languages · 128.18k downloads · 25 likes
  An INT4-quantized version of the Meta-Llama-3.1-8B-Instruct model, quantized with the GPTQ algorithm and suitable for multilingual dialogue scenarios.
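GPTQ stores INT4 weights in groups, each group carrying its own scale, which keeps quantization error local. The GPTQ algorithm itself chooses the quantized values to minimize each layer's output reconstruction error; the grouped storage layout alone can be sketched with plain round-to-nearest (hypothetical helper names, illustration only):

```python
# Illustrative group-wise INT4 layout as used by GPTQ-style checkpoints.
# GPTQ additionally solves for codes that minimize layer output error;
# this sketch uses simple round-to-nearest to show the format only.

def groupwise_int4(weights, group_size=4):
    """Quantize weights in groups, one scale per group. Returns
    a list of (int4_codes, scale) pairs."""
    groups = []
    for i in range(0, len(weights), group_size):
        g = weights[i:i + group_size]
        scale = (max(abs(w) for w in g) / 7.0) or 1.0  # avoid zero scale
        q = [max(-8, min(7, round(w / scale))) for w in g]
        groups.append((q, scale))
    return groups

def dequant(groups):
    """Flatten the grouped codes back to approximate float weights."""
    return [v * s for q, s in groups for v in q]

# A small group (e.g. 4 here; 128 is a common real-world choice) lets
# each scale adapt to its local weight range, so an outlier in one
# group does not blow up the error everywhere else.
ws = [0.1, -0.2, 0.05, 0.15, 3.0, -2.0, 1.0, 0.5]
approx = dequant(groupwise_int4(ws, group_size=4))
```

Note how the second group's large values (around 3.0) get their own coarse scale without degrading the precision of the small first group.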
- **Deepseek Coder V2 Lite Instruct FP8** (RedHatAI) · Other · Large Language Model · Transformers · 11.29k downloads · 7 likes
  An FP8-quantized version of DeepSeek-Coder-V2-Lite-Instruct, optimized for inference efficiency and suitable for commercial and research use in English.
- **Mapo Beta** (mapo-t2i) · Text-to-Image · 30 downloads · 6 likes
  MaPO is a reference-free, energy-efficient, and memory-friendly alignment method for text-to-image diffusion models.
- **Koala Lightning 700m** (etri-vilab) · Image Generation · 170 downloads · 6 likes
  KOALA-Lightning-700M is an efficient text-to-image model trained via knowledge distillation from SDXL-Lightning, significantly improving inference speed while maintaining generation quality.
- **Llama 2 13B Fp16 French** (Nekochu) · Apache-2.0 · Large Language Model · Supports Multiple Languages · 79 downloads · 11 likes
  A French Q&A model fine-tuned from Llama-2-13b-chat, supporting tasks such as Baroque-style text generation.